Apache Drill: Interactive Ad-Hoc Analysis at Scale.

Authors

  • Michael Hausenblas
  • Jacques Nadeau

Abstract

Apache Drill is a distributed system for interactive ad-hoc analysis of large-scale datasets. Designed to handle up to petabytes of data spread across thousands of servers, Drill aims to respond to ad-hoc queries with low latency. In this article, we introduce Drill's architecture, discuss its extensibility points, and put it into the context of the emerging offerings in the interactive analytics realm.

Similar resources

SparkSeq: fast, scalable and cloud-ready tool for the interactive genomic data analysis with nucleotide precision

Many time-consuming analyses of next-generation sequencing data can be addressed with modern cloud computing. The Apache Hadoop-based solutions have become popular in genomics because of their scalability in a cloud infrastructure. So far, most of these tools have been used for batch data processing rather than interactive data querying. The SparkSeq software has been created to ...

Leveraging in-Memory Technology to Improve the Acceptance of MSS - a Managers' Perspective

Management support systems (MSS) help managers to perform their jobs more efficiently. With in-memory technology, a new IT enabler promises to support managers with benefits ranging from reduced time for MSS data entry and analysis to entirely new topics of analysis. Hence, the present situation is favorable for an MSS redesign applying in-memory apps. Such apps are field-tested and ready-...

System Support for Small-Scale Auctions

Mobile wireless computing devices provide interesting opportunities for building new styles of interactive applications by exploiting ad hoc networks. The basic networking technology needed to support such applications is readily available. The paper describes the design and implementation of a small-scale auction application in an ad hoc network. It is important to develop a variety of interac...

Ad Hoc Synchronization Considered Harmful

Many synchronizations in existing multi-threaded programs are implemented in an ad hoc way. The first part of this paper does a comprehensive characteristic study of ad hoc synchronizations in concurrent programs. By studying 229 ad hoc synchronizations in 12 programs of various types (server, desktop and scientific), including Apache, MySQL, Mozilla, etc., we find several interesting and perha...

Piranha: Optimizing Short Jobs in Hadoop

Cluster computing has emerged as a key parallel processing platform for large scale data. All major internet companies use it as their major central processing platform. One of cluster computing’s most popular examples is MapReduce and its open source implementation Hadoop. These systems were originally designed for batch and massive-scale computations. Interestingly, over time their production...


Journal title:
  • Big data

Volume 1, Issue 2

Pages: -

Publication date: 2013